How to Build a Digital Virtual Assistant in Python

Blog

How to Build a Digital Virtual Assistant in Python

In this blog: Learn how to build a personal assistant using Python, for your application. Download the Virtual Assistant runtime environment and build your very own digital virtual assistant.

This tutorial will walk you through the basics of building your own digital virtual assistant in Python, complete with voice activation plus response to a few basic inquiries. From there, you can customize it to perform whatever tasks you need most.

The rise of automation, along with increased computational power, novel application of statistical algorithms, and improved accessibility to data, have resulted in the birth of the personal digital assistant market, popularly represented by Apple’s Siri, Microsoft’s Cortana, Google’s Google Assistant, and Amazon’s Alexa.

While each assistant may specialize in slightly different tasks, they all seek to make the user’s life easier through verbal interactions so you don’t have to search out a keyboard to find answers to questions like “What’s the weather today?” or “Where is Switzerland?”. Despite the inherent “cool” factor that comes with using a digital assistant, you may find that the aforementioned digital assistants don’t cater to your specific needs. Fortunately, it’s relatively easy to build your own.

Digital Virtual Assistant in Python – Step 1: Installing Python

To follow along with the code in this tutorial, you’ll need to have a recent version of Python installed. I’ll be using ActivePython, for which you have two choices:

Download and install the pre-built Virtual Assistant runtime environment (including Python 3.6) for Windows 10, macOS, CentOS 7, or…
Build your own custom Python runtime with just the packages you’ll need for this project, by creating a free ActiveState Platform account, after which you will see the following image:

How to install Python Packages with the ActiveState Platform

Click the Get Started button and choose Python 3.6 and the OS you’re working in. In addition to the standard packages included in ActivePython, we’ll need to add a few third party packages, including something that can do speech recognition, convert text to speech and playback audio:

Speech Recognition Package – when you voice a question, we’ll need something that can capture it. The SpeechRecognition package allows Python to access audio from your machine’s microphone, transcribe audio, save audio to an audio file, and other similar tasks.
Text to Speech Package – our assistant will need to convert your voiced question to a text one. And then, once the assistant looks up an answer online, it will need to convert the response into a voiceable phrase. For this purpose, we’ll use the gTTS package (Google Text-to-Speech). This package interfaces with Google Translate’s API. More information can be found here.
Audio Playback Package – All that’s left is to give voice to the answer. The mpyg321 package allows for Python to play MP3 files.

Once the runtime builds, you can download the State Tool and use it to install your runtime into a virtual environment.

And that’s it! You now have Python installed, as well as everything you need to build the sample application. In doing so, ActiveState takes the (sometimes frustrating) environment setup and dependency resolution portion out of your hands, allowing you to focus on actual development.

All the code used in this tutorial can be found in my Github repo.

All set? Let’s go.

Digital Virtual Assistant in Python – Step 2: Voice Input

The first step in creating your own personal digital assistant is establishing voice communication. We’ll create two functions using the libraries we just installed: one for listening and another for responding. Before we do so, let’s import the libraries we installed, along with a few of the standard Python libraries:

import speech_recognition as sr
from time import ctime
import time
import os
from gtts import gTTS
import requests, json

Now let’s define a function called listen. This uses the SpeechRecognition library to activate your machine’s microphone, and then converts the audio to text in the form of a string. I find it reassuring to print out a statement when the microphone has been activated, as well as the stated text that the microphone hears, so we know it’s working properly. I also include conditionals to cover common errors that may occur if there’s too much background noise, or if the request to the Google Cloud Speech API fails.

def listen():
    r = sr.Recognizer()
    with sr.Microphone() as source:
        print("I am listening...")
        audio = r.listen(source)
    data = ""
    try:
        data = r.recognize_google(audio)
        print("You said: " + data)
    except sr.UnknownValueError:
        print("Google Speech Recognition did not understand audio")
    except sr.RequestError as e:
        print("Request Failed; {0}".format(e))
    return data

For the voice response, we’ll use the gTTS library. We’ll define a function respond that takes a string input, prints it, then converts the string to an audio file. This audio file is saved to the local directory and then played by your operating system.

def respond(audioString):
    print(audioString)
    tts = gTTS(text=audioString, lang='en')
    tts.save("speech.mp3")
    os.system("mpg321 speech.mp3")

The listen and respond functions establish the most important aspects of a digital virtual assistant: the verbal interaction. Now that we’ve got the basic building blocks in place, we can build our digital assistant and add in some basic features.

Digital Virtual Assistant in Python – Step 3: Voiced Responses

To construct our digital assistant, we’ll define another function called digital_assistant and provide it with a couple of basic responses:

def digital_assistant(data):
    if "how are you" in data:
        listening = True
        respond("I am well")
    if "what time is it" in data:
        listening = True
        respond(ctime())
       
    if "stop listening" in data:
        listening = False
        print('Listening stopped')
        return listening
    return listening

This function takes whatever phrase the listen function outputs as an input, and checks what was said. We can use a series of if statements to understand the voice query and output the appropriate response. To make our assistant seem more human, the first thing we’ll add is a response to the question “How are you?” Feel free to change the response to your liking.

The second basic feature included is the ability to respond with the current time. This is done with the ctime function from the time package.

I also build in a “stop listening” command to terminate the digital assistant. The listening variable is a Boolean that is set to True when the digital assistant is active, and False when not. To test it out, we can write the following Python script, which includes all the previously defined functions and imported packages:

time.sleep(2)
respond("Hi Dante, what can I do for you?")
listening = True
while listening == True:
    data = listen()
    listening = digital_assistant(data)

Save the script as digital_assistant.py. Before we run the script via the command prompt, let’s check that ActiveState Python is running correctly by entering the following on the command line:

$ Python3.6

If ActivePython installed correctly, you should obtain an output that looks like this:

ActivePython 3.6.6.3606 (ActiveState Software Inc.) based on
Python 3.6.6 (default, Dec 19 2018, 08:04:03)
[GCC 4.2.1 Compatible Apple LLVM 9.0.0 (clang-900.0.39.2)] on darwin
Type "help", "copyright", "credits" or "license" for more information.

Note that if you have other versions of Python already installed on your machine, ActivePython may not be the default version. For instructions on how to make it the default for your operating system, ActiveState provides procedures here.

With ActivePython now the default version, we can run the script at the command prompt using:

$ Python3.6 digital_assistant.py

You should see and hear the output:

Hi Dante, what can I do for you?
I am listening...

Now you can respond with one of the three possibilities we defined in the digital_assistant function, and it will respond appropriately. Cool, right?

How to Create a Digital Assistant Google Maps Query

I find myself frequently wondering where a certain city or country is with respect to the rest of the world. Typically this means I open a new tab in my browser and search for it on Google Maps. Naturally, if my new digital assistant could do this for me, it would save me the trouble.

To implement this feature, we’ll add a new if statement into our digital_assistant function:

def digital_assistant(data):
    if "how are you" in data:
        listening = True
        respond("I am well")
    if "what time is it" in data:
        listening = True
        respond(ctime())
    if "where is" in data:
        listening = True
        data = data.split(" ")
        location_url = "https://www.google.com/maps/place/" + str(data[2])
        respond("Hold on Dante, I will show you where " + data[2] + " is.")
        maps_arg = '/usr/bin/open -a "/Applications/Google Chrome.app" ' + location_url
        os.system(maps_arg)
       
    if "stop listening" in data:
        listening = False
        print('Listening stopped')
        return listening
    return listening

The new if statement picks up if you say “where is” in the voice query, and appends the next word to a Google Maps URL. The assistant replies, and a command is issued to the operating system to open Chrome with the given URL. Google Maps will open in your Chrome browser and display the city or country you inquired about. If you have a different web browser, or your applications are in a different location, adapt the command string accordingly.

How to Create a Digital Assistant Weather Query

If you live in a place where the weather can change on a dime, you may find yourself searching for the weather every morning to ensure that you are adequately equipped before leaving the house. This can eat up significant time in the morning, especially if you do it every day, when the time can be better spent taking care of other things.

To implement this within our digital assistant, we’ll add another if statement that recognizes the phrase “What is the weather in..?”

def digital_assistant(data):
    global listening
    if "how are you" in data:
        listening = True
        respond("I am well")
    if "what time is it" in data:
        listening = True
        respond(ctime())
    if "where is" in data:
        listening = True
        data = data.split(" ")
        location_url = "https://www.google.com/maps/place/" + str(data[2])
        respond("Hold on Dante, I will show you where " + data[2] + " is.")
        maps_arg = '/usr/bin/open -a "/Applications/Google Chrome.app" ' + location_url
        os.system(maps_arg)
       
    if "what is the weather in" in data:
        listening = True
        api_key = "Your_API_key"
        weather_url = "http://api.openweathermap.org/data/2.5/weather?"
        data = data.split(" ")
        location = str(data[5])
        url = weather_url + "appid=" + api_key + "&q=" + location
        js = requests.get(url).json()
        if js["cod"] != "404":
            weather = js["main"]
            temp = weather["temp"]
            hum = weather["humidity"]
            desc = js["weather"][0]["description"]
            resp_string = " The temperature in Kelvin is " + str(temp) + " The humidity is " + str(hum) + " and The weather description is "+ str(desc)
            respond(resp_string)
        else:
            respond("City Not Found")
    if "stop listening" in data:
        listening = False
        print('Listening stopped')
    return listeningtime.sleep(2)
respond("Hi Dante, what can I do for you?")
listening = True
while listening == True:
    data = listen()
    listening = digital_assistant(data)

For the weather query to function, it needs a valid API key to obtain the weather data. To get one, go here and then replace Your_API_key with the actual value. Once we concatenate the URL string, we’ll use the requests package to connect with the OpenWeather API. This allows Python to obtain the weather data for the input city, and after some parsing, extract the relevant information.

Customizing your Digital Virtual Assistant in Python

There is a myriad of digital assistants currently on the market, including:

Google Assistant and Siri, which focus primarily on helping users with non-work related tasks like leisure and fitness.
Cortana, which focuses on work efficiency.
Alexa, which is more concerned with retail.

With modest expectations in mind, each does its job relatively well. If you require more specificity, designing your own digital assistant is far from a pipe dream. Recent advances in speech recognition and converting text to speech make it viable even for hobbyists. And working in Python greatly simplifies the task, giving you the ability to make any number of customization to tailor your assistant to your needs.

Sign up for a free ActiveState Platform account so you can download the Virtual Assistant runtime environment and build your very own digital virtual assistant.

Related Blogs:

How To Build a Recommendation Engine in Python

Customer Support Chatbots: Easier & More Effective Than You Think

Frequently Asked Questions

How do I create a Python virtual assistant?

As this article explains, to create a Python virtual assistant all you need to do is:

Install the pre-built Virtual Assistant Python environment.
Create two voice communication functions, one for listening and another for responding.
Create voiced responses to the kinds of questions you want your assistant to answer.
Create the lookup routines the virtual assistant will perform when asked a specific question.

That’s it! Just follow the included code samples to make your assistant as simple or complicated as you like.

Create a free ActiveState Platform account so you can download the Virtual Assistant Python environment.

What can a Python virtual assistant do?

If you’re creating your own Python virtual assistant, you can define just about any question and answer you want. Most commonly, virtual assistants provide:

Current time
Local weather
Song/podcast playback
Turn on smart appliances/lights
Phone calling
Text writing
News updates
Internet searches

Learn more in the article above, or learn how to build a recommendation engine in Python.

How do I get my own Python desktop assistant?

Desktop assistants such as Siri and Cortana are available as a feature on modern Mac and Windows computers, but they’re not written in Python. The best way to get a Python virtual assistant for your computer desktop is to build your own.

Install the pre-built Virtual Assistant Python environment and follow the instructions in this post to build your own Python virtual assistant.

Why should I use Python to build a virtual assistant?

Python is well known for its readability and ease of use, which can make creating a virtual assistant a simple task even beginners can tackle.

Install our pre-built Virtual Assistant Python environment and follow the instructions in this post to build your own Python virtual assistant.

Overview

Community Forum

Service Status

Featured

Introducing ActiveState’s Secure, Custom Container Images

Read post

Featured

Introducing ActiveState’s Secure, Custom Container Images

Read post

Vulnerability Blast Radius

Risk Prioritization Copilot

Precision Remediation Pipeline

USE CASES

Container Security

Vulnerability Management and Remediation

Software Supply Chain Security

Compliance and SBOM

Beyond End-of-Life Support

RESOURCES

Read

Watch

Attend

FEATURED

Introducing ActiveState’s Secure, Custom Container Images

Read More

FEATURED

Introducing ActiveState’s Secure, Custom Container Images

Read More

Docs

Support Overview

Community Forum

Service Status

Login